Abstract

We introduce a covariance matrix estimator that both takes into account the 
heteroskedasticity of financial returns (by using an exponentially weighted 
moving average) and reduces the effective dimensionality of the estimation 
(and hence measurement noise) via techniques borrowed from random matrix 
theory. We calculate the spectrum of large exponentially weighted random 
matrices (whose upper band edge needs to be known for the implementation
of the estimation) analytically by a procedure analogous to that used
for standard random matrices. Finally, we illustrate, on empirical data, the
superiority of the newly introduced estimator in a portfolio optimization context
over both the method of exponentially weighted moving averages and the
uniformly-weighted random-matrix-theory-based filtering. 
1 Introduction



Covariance matrices of financial returns play a crucial role in financial theory and
in many practical applications. In particular, financial covariance matrices are the key
input to Markowitz's classical portfolio selection problem [1], which forms
the basis of modern investment theory. Any practical use of the theory
therefore requires reliable estimates of the covariance matrices of real-life
financial returns (based on historical data). It was clear from the very outset that
the estimation of such matrices suffers from the "curse of dimensions": if one
denotes by N the number of assets and by T the length of the time series used for
estimation, one has to estimate $O(N^2)$ parameters from a sample of $O(NT)$ historical
returns, and usually the condition $T \gg N$ cannot be fulfilled in realistic financial
applications. For finite N and T, with N large and T bounded for practical reasons (see footnote 1),
the estimation error of the covariance matrix can become so overwhelming that the
whole applicability of the theory becomes questionable.

This difficulty has been well known for a long time, see e.g. Ref. [2] and the numerous
references therein. The effect of estimation noise (in the covariance matrix of
financial returns) on the solution of the classical portfolio selection problem has been
extensively studied, see e.g. Ref. [3]. The general approach to reducing this estimation
noise has been to impose some structure on the covariance matrix in order to decrease
the effective number of parameters to be estimated. This can be done in several
ways. For example, various "models" have been introduced on "economic" grounds,
such as single- and multi-index models, grouping by industry sectors, or macroeconomic
factor models (see e.g. the numerous references in [2]). Alternatively, "purely statistical"
covariance estimation methods have also been used, such as principal component
analysis or Bayesian shrinkage estimation [3]. Several studies compare the performance
of (some of) these covariance estimation procedures (in the framework of the classical
portfolio optimization problem), see e.g. Ref. [3]. The general conclusion of all these studies
is that reducing the dimensionality of the problem by imposing some structure on the
covariance matrix may be of great help in reducing the effect of measurement noise in
the optimization of portfolios.

The problem of noise in financial covariance matrices has been put in a new light by
the findings of Ref. [4] and the following Refs. [HIHIIE!, obtained by the application of
random matrix theory. These studies have shown that correlation matrices determined
from financial return series contain such a high amount of noise that, apart from
a few large eigenvalues and the corresponding eigenvectors, their structure can be
regarded as random (in the example analyzed in Ref. P 94% of the spectrum could be 
fitted by that of a purely random matrix). The results of Refs. jUlE] not only showed 
that the amount of estimation noise in financial correlation matrices is large, but also 
provided the basis for a technique that can be used for an improved estimation of such
correlation matrices. A "filtering" procedure based on "eliminating" those eigenvalues
and eigenvectors of the empirical correlation matrix that correspond to the noise band
deduced from random matrix theory has been introduced in Refs. jHl El and found to
be very effective in reducing the estimation noise in the portfolio optimization context.

Footnote 1: Typically one wants to consider several hundred assets and has daily financial data
available over a period of a couple of years at most.

In this paper we introduce a covariance matrix estimator that combines the filtering 
procedure based on random matrix theory (that seeks to attenuate estimation noise by 
reducing the effective dimensionality of the covariance matrix) with the technique of 
exponentially weighted moving averages of returns (that tries to take into account the 
heteroskedasticity of volatility and correlations, a salient feature of real-life financial 
returns). We show that this estimator can be very powerful in constructing portfolios 
with better risk characteristics. In particular, it seems that by taking into account the 
non-stationarity of the time-series, the estimator can outperform the standard random 
matrix theory-based filter (where returns from a given time window are uniformly 
weighted). Most remarkably, the spectrum of exponentially weighted random matrices 
(whose upper band edge needs to be computed for the practical implementation of 
the estimation procedure) can be computed analytically in a certain limiting case.
Concerning another aspect of the use of empirical financial covariance matrices, we
argue that even though matrices obtained by simple exponential weighting (without
any noise filtering) can be successful for determining the risk of a given portfolio, their 
use in the context of portfolio optimization can be very dangerous. 



2 A Covariance Matrix Estimator Based on Exponential Weighting and 
Random Matrix Theory 

Suppose we have a sample of (say, daily) returns of N financial assets over a given
period of time. Let us denote by $x_{ik}$ the return of stock i (i = 1, 2, ..., N) at time
$t - k$, with t the last point of time in the available data (k = 0, 1, ...). A simple
and widely used estimator for the covariance matrix of returns is the "sample" or
"historical" matrix:

$$C_{ij} = \frac{1}{T}\sum_{k=0}^{T-1} x_{ik}\,x_{jk}, \tag{1}$$

where T is the length of the sample. Under the assumption that the distribution of
returns is Gaussian, this is also the maximum likelihood estimator, which is known
to perform well in the limit of fixed N and $T \to \infty$. However, in finite-length samples,
especially when N is large, the estimation noise (measurement error) can become significant.
Estimation methods that can reduce this measurement error have long been
a focus of attention for academics and practitioners alike. The root of the problem
is evident: N time series of length T each do not contain sufficient information
to allow the $O(N^2)$ matrix elements to be reliably estimated unless $T \gg N$, which
hardly ever occurs in a banking context. Since the length of the time series is, for
obvious reasons, rather limited (of the order of a few years, i.e. with daily data
$T \sim 1000$ at most), the only conceivable solution is to reduce the effective number of
dimensions of the problem. Over the years several techniques have been introduced
and successfully applied that reduce the estimation error by shrinking the number of
dimensions. One of the latest of these techniques was inspired by results from random
matrix theory. It consists of "cleaning" the covariance matrix by retaining only those
components (eigenvalues and corresponding eigenvectors) that lie outside the noise
band corresponding to the spectrum of a purely random matrix (see Refs. jHUHj). It
has been demonstrated empirically in Refs. [HI Ej and subsequently via simulations in
Ref. [10] that this technique is indeed very powerful for reducing the estimation noise
of covariance matrices used in standard (mean-variance) portfolio optimization.
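As an illustration of why $T \gg N$ matters, the following minimal sketch evaluates the sample estimator of Eq. (1) on synthetic data. The array layout (assets in rows, observations in columns), the function name and the random data are assumptions made for this example only.

```python
import numpy as np

def sample_covariance(x: np.ndarray) -> np.ndarray:
    """Eq. (1): x has shape (N, T); returns are used as given (no mean subtraction)."""
    return x @ x.T / x.shape[1]

rng = np.random.default_rng(0)
N, T = 50, 40                        # more assets than observations
x = rng.standard_normal((N, T))      # synthetic i.i.d. returns, purely illustrative
C = sample_covariance(x)
rank = np.linalg.matrix_rank(C)      # cannot exceed T, so C is singular when T < N
```

With fewer observations than assets the estimate has rank at most T, so it cannot be inverted, which is exactly the operation that mean-variance optimization requires.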

However, it is well known that financial returns exhibit heteroskedastic volatility
and correlations (i.e. the random processes generating the returns are not stationary).
Accordingly, a large part of the academic finance literature has focused on modelling
the dynamics of the covariance of financial returns (see footnote 2). These ideas have also found their
way into industrial practice: in the early 1990s J. P. Morgan and Reuters introduced
RiskMetrics [12], a methodology for determining the market risk of portfolios.
RiskMetrics soon became the most widely used method for measuring market risk
and is now considered a benchmark in risk management. At the heart of the method
lies the estimation of the covariance matrix of returns ("risk factors") through an
exponentially weighted moving average (see footnote 3):

$$C_{ij} = \frac{1-\alpha}{1-\alpha^{T}}\sum_{k=0}^{T-1}\alpha^{k}\,x_{ik}\,x_{jk}, \tag{2}$$

where $\alpha$ is the exponential decay factor and the normalization factor $(1-\alpha)/(1-\alpha^{T})$
can be approximated by $1-\alpha$ if T is large enough. This
method has been found to be very successful in estimating the market risk of given,
fixed portfolios.
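A minimal sketch of the exponentially weighted estimator of Eq. (2) is given below. The column ordering (column k holds the return at time t - k, most recent first), the function name and the synthetic input are assumptions of this example, not part of the RiskMetrics specification.

```python
import numpy as np

def ewma_covariance(x: np.ndarray, alpha: float) -> np.ndarray:
    """Eq. (2). x has shape (N, T); column k holds the returns at time t - k,
    so column 0 is the most recent observation."""
    n_obs = x.shape[1]
    weights = (1.0 - alpha) / (1.0 - alpha**n_obs) * alpha ** np.arange(n_obs)
    return (x * weights) @ x.T       # sum_k w_k x_ik x_jk

rng = np.random.default_rng(1)
x = rng.standard_normal((5, 500))    # synthetic returns, purely illustrative
C_ewma = ewma_covariance(x, alpha=0.94)
weights_sum = ((1 - 0.94) / (1 - 0.94**500) * 0.94 ** np.arange(500)).sum()
```

The normalization makes the weights sum to one, so for a constant-variance series the estimator is unbiased in the same sense as Eq. (1).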

In contrast, if this covariance matrix estimate is used for portfolio optimization (i.e.
for selecting the portfolio in a mean-variance framework, which involves the inversion
of the matrix), the estimation error will be quite large for typical values of the ratio
T/N (see Ref. [10]). In the case of exponential weighting, the results in Ref. [10]
imply that the degree of suboptimality will depend on the ratio of the effective time
length $-1/\log\alpha$ to the number of assets N. In particular, since the effective time
corresponding to the value of the exponential decay factor $\alpha$ suggested by Ref. [12]
($\alpha = 0.94$ for daily data) is shorter than the length of the time windows used in a typical
standard (uniformly weighted) covariance matrix estimation, it can be expected that
for the same portfolio size N the effect of noise (suboptimality of optimized portfolios)
will be larger with exponential weighting than without it. Nevertheless, dimension
reduction techniques based on random matrix theory (developed in Refs. [HI E] for
uniformly weighted matrices) can be expected to be useful in reducing the effect of
noise also in the case of exponential weighting.

Footnote 2: The most widely used approach to modelling the heteroskedasticity of financial returns is via
ARCH/GARCH processes, see e.g. Ref. [11] for a review.

Footnote 3: The idea of the method is that old data gradually become obsolete and should therefore
contribute less to the estimates than more recent information. The exponential weighting is consistent
with GARCH, see Ref. [12].
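To make the effective time length concrete, the short computation below evaluates $-1/\log\alpha$ for the decay factor quoted above. The portfolio size of 100 assets is a hypothetical value chosen only to illustrate the resulting ratio.

```python
import math

def effective_length(alpha: float) -> float:
    """Effective time length -1/log(alpha) of exponential weighting, in observations."""
    return -1.0 / math.log(alpha)

t_eff = effective_length(0.94)   # decay factor quoted above for daily data
ratio = t_eff / 100              # effective T/N for a hypothetical N = 100 portfolio
```

For $\alpha = 0.94$ the effective window is roughly 16 observations, so for a portfolio of 100 assets the effective T/N is far below 1.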

The purpose of this paper is to introduce an estimator that, by adapting
the filtering procedure of Refs. [SJ E] to exponentially weighted matrices, can reduce the
estimation noise of the covariance matrix used for portfolio optimization (while
still taking account of the non-stationarity of financial returns). The usefulness
of this procedure for portfolio optimization will also be illustrated. In fact, since the
spectrum of exponentially weighted purely random matrices of the form of Eq. (2) (with
$x_{ik}$ i.i.d. random variables) will be seen to be qualitatively similar to that of the standard
(uniformly weighted) random matrices, the same filtering procedure can be applied in
both cases. The only difference lies in the value of the upper edge of the noise spectrum.
Therefore, in order to apply the filtering procedure, one has to know the value of the
upper edge of the noise spectrum of an exponentially weighted purely random matrix
for a given N and $\alpha$. This can be determined for each given N and $\alpha$ by Monte Carlo
simulation but, most remarkably, in the limit $N \to \infty$, $\alpha \to 1$ with $N(1-\alpha)$ fixed it
is possible to obtain the full spectrum as the solution to a set of analytical equations,
as shown below.
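The Monte Carlo route mentioned above can be sketched as follows. This is only an illustration under stated assumptions: unit-variance Gaussian noise, a window long enough that $\alpha^{T}$ is negligible, and the largest eigenvalue averaged over a handful of realizations as a proxy for the upper edge. The function names and parameter values are hypothetical.

```python
import numpy as np

def ewma_random_matrix(n_assets: int, alpha: float, n_obs: int, rng) -> np.ndarray:
    """Exponentially weighted matrix of the form of Eq. (2) built from i.i.d. N(0, 1) noise."""
    x = rng.standard_normal((n_assets, n_obs))
    weights = (1.0 - alpha) / (1.0 - alpha**n_obs) * alpha ** np.arange(n_obs)
    return (x * weights) @ x.T

def mc_upper_edge(n_assets: int, alpha: float, n_obs: int,
                  n_trials: int = 5, seed: int = 0) -> float:
    """Average largest eigenvalue over n_trials independent noise matrices."""
    rng = np.random.default_rng(seed)
    tops = [np.linalg.eigvalsh(ewma_random_matrix(n_assets, alpha, n_obs, rng))[-1]
            for _ in range(n_trials)]
    return float(np.mean(tops))

N = 200
alpha = 1.0 - 0.5 / N                          # example choice: N(1 - alpha) = 0.5
edge = mc_upper_edge(N, alpha, n_obs=4000)     # alpha**4000 is about exp(-10)
```

For finite N the largest eigenvalue fluctuates from sample to sample, so in practice one would average over more realizations than the five used here.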



3 The Spectrum of Exponentially Weighted Random Matrices 

The derivation of the spectrum of exponentially weighted random matrices follows the
steps and notation of the standard (Wishart) case in Ref. [13], which is itself based on
Ref. [14]. In the limit of an infinite window, the exponentially weighted random matrix
is given by

$$C_{ij} = \sum_{k=0}^{\infty} (1-\alpha)\,\alpha^{k}\,x_{ik}\,x_{jk}, \tag{3}$$

where $x_{ik}$ is assumed to be Gaussian i.i.d. with zero mean and standard deviation $\sigma$.
One can rewrite

$$C_{ij} = \sum_{k=0}^{\infty} H_{ik} H_{jk} \tag{4}$$

with $H_{ik}$ having a k-dependent variance $\sigma_k^2 = \sigma^2 (1-\alpha)\,\alpha^{k}$.

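Following Ref. [13], we can use the resolvent technique to write the density of
eigenvalues as the imaginary part of the derivative of a log-partition function:

$$\rho(\lambda) = \frac{1}{N\pi}\,\mathrm{Im}\,\frac{d}{d\lambda} Z(\lambda), \tag{5}$$

with

$$Z(\lambda) = -2\log\int \exp\left[-\frac{\lambda}{2}\sum_{i=1}^{N}\varphi_i^2 + \frac{1}{2}\sum_{i,j=1}^{N}\sum_{k=0}^{\infty}\varphi_i\varphi_j H_{ik}H_{jk}\right]\prod_{i=1}^{N}\frac{d\varphi_i}{\sqrt{2\pi}}. \tag{6}$$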

We can now average $Z(\lambda)$ over the random matrix $H_{ik}$. To keep the derivation
simple, we will average the argument of the log rather than average the log. As in
the standard case, one can use the formal device of the replica trick and show that
the result is indeed self-averaging. The $H_{ik}$-dependent term can be averaged using a
standard Gaussian integral:

$$\left\langle \exp\left[\frac{1}{2}\sum_{i,j=1}^{N}\sum_{k=0}^{\infty}\varphi_i\varphi_j H_{ik}H_{jk}\right]\right\rangle = \prod_{k=0}^{\infty}\left(1-\sigma_k^2\sum_{i=1}^{N}\varphi_i^2\right)^{-1/2} \tag{7}$$

$$= \exp\left[-\frac{1}{2}\sum_{k=0}^{\infty}\log\left(1-\sigma^2(1-\alpha)\,\alpha^{k}\sum_{i=1}^{N}\varphi_i^2\right)\right]. \tag{8}$$



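We then introduce $q \equiv \sigma^2(1-\alpha)\sum_{i=1}^{N}\varphi_i^2$, which we fix using an integral representation
of the delta function:

$$\delta\left(q-\sigma^2(1-\alpha)\sum_{i=1}^{N}\varphi_i^2\right) = \int\frac{1}{2\pi}\exp\left[i\zeta\left(q-\sigma^2(1-\alpha)\sum_{i=1}^{N}\varphi_i^2\right)\right]d\zeta. \tag{9}$$

After performing the integral over the $\varphi_i$'s and writing $z = 2i\zeta(1-\alpha)$, we find:

$$Z(\lambda) = -2\log\frac{N}{4\pi}\int_{-i\infty}^{i\infty}\int_{-\infty}^{\infty}\exp\left[-\frac{N}{2}\left(\log(\lambda-\sigma^2 z)+\frac{1}{N}\sum_{k=0}^{\infty}\log(1-\alpha^{k}q)+Qqz\right)\right]dq\,dz, \tag{10}$$

where $Q = 1/(N(1-\alpha))$ measures the "quality" of the estimation as the ratio of the
decay time of the exponential weighting to the number of assets.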

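This is where the main difference with the Wishart case arises. The term $Q\log(1-q)$
of the Wishart case is replaced by $\frac{1}{N}\sum_{k=0}^{\infty}\log(1-\alpha^{k}q) = -F_Q(q)$, with

$$F_Q(q) = -\frac{1}{N}\sum_{k=0}^{\infty}\log(1-\alpha^{k}q), \tag{11}$$

which we need to compute in the $N \to \infty$, $\alpha \to 1$ limit with $Q = 1/(N(1-\alpha))$ fixed.
We start by expanding the log in a Taylor series about 1 (i.e. in powers of $\alpha^{k}q$):

$$F_Q(q) = \frac{1}{N}\sum_{k=0}^{\infty}\sum_{\ell=1}^{\infty}\frac{q^{\ell}\alpha^{k\ell}}{\ell} \tag{12}$$

$$= \frac{1}{N}\sum_{\ell=1}^{\infty}\frac{q^{\ell}}{\ell\left(1-(1-1/(QN))^{\ell}\right)}. \tag{13}$$

Taking the $N \to \infty$ limit we find

$$F_Q(q) = Q\sum_{\ell=1}^{\infty}\frac{q^{\ell}}{\ell^{2}} = Q\,F(q), \tag{14}$$

where $F(q)$ is the hypergeometric function (the dilogarithm) with the property $F'(q) = -\log(1-q)/q$.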
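The limit in Eq. (14) can be checked numerically. The sketch below compares the finite-N sum $-(1/N)\sum_k \log(1-\alpha^k q)$, with $\alpha = 1 - 1/(QN)$, against the truncated series $Q\sum_\ell q^\ell/\ell^2$. The test values q = 0.5, Q = 2 and N = 1000, as well as the truncation thresholds, are arbitrary choices for illustration.

```python
import math

def f_q_finite(q: float, big_q: float, n_assets: int) -> float:
    """Finite-N value of F_Q(q) = -(1/N) * sum_k log(1 - alpha^k q)."""
    alpha = 1.0 - 1.0 / (big_q * n_assets)
    total, weight = 0.0, 1.0              # weight holds alpha**k
    while weight * abs(q) > 1e-17:        # remaining terms are negligible
        total += math.log1p(-weight * q)
        weight *= alpha
    return -total / n_assets

def f_q_limit(q: float, big_q: float, n_terms: int = 200) -> float:
    """Limiting form Q * sum_l q^l / l^2, truncated after n_terms terms."""
    return big_q * sum(q**l / l**2 for l in range(1, n_terms + 1))

finite = f_q_finite(0.5, big_q=2.0, n_assets=1000)
limit = f_q_limit(0.5, big_q=2.0)
```

The two values agree closely already for N = 1000, with a difference that shrinks as N grows at fixed Q.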

We can now perform the integrals over z and q using the saddle point method,
leading to the following equations:

$$Qq = \frac{\sigma^2}{\lambda-\sigma^2 z} \quad\text{and}\quad z = -\frac{\log(1-q)}{q}. \tag{15}$$

Here we are saved by the fact that only the derivative $-\log(1-q)/q$ of the function
$F(q)$ appears.

To find the density we need to differentiate Eq. (10) with respect to $\lambda$. Since we do
not have explicit expressions for $q(\lambda)$ and $z(\lambda)$ at the saddle point, it is important to
realize that partial derivatives with respect to these variables are zero.

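We find:

$$\frac{dZ}{d\lambda} = \frac{N}{\lambda-\sigma^2 z(\lambda)} = \frac{NQ\,q(\lambda)}{\sigma^2}. \tag{16}$$

We can now use Eq. (5) to find the density of eigenvalues:

$$\rho(\lambda) = \frac{Q\,\mathrm{Im}[q(\lambda)]}{\pi\sigma^2}. \tag{17}$$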

Because Eqs. (15) are transcendental, we cannot find an explicit form for $\rho(\lambda)$; nevertheless,
it is straightforward to write $\rho(\lambda)$ as the zero of a single equation which can be
solved numerically. We find $\rho(\lambda) = Qv/\pi$, where $v$ is the solution of

$$\frac{\lambda}{\sigma^2}-\frac{v\lambda}{\tan(v\lambda)}+\log(v\sigma^2)-\log\sin(v\lambda)-Q^{-1}=0. \tag{18}$$
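One way to see how Eq. (18) follows from Eqs. (15) is sketched below. The decomposition $q = \sigma^2(u + iv)$ with real $u$ and $v$ is notation introduced here for the illustration; it is chosen so that $\mathrm{Im}\,q = \sigma^2 v$ reproduces $\rho(\lambda) = Qv/\pi$.

```latex
% Eliminate z from Eqs. (15):
\log(1-q) = \frac{1}{Q} - \frac{\lambda q}{\sigma^2}.
% Write q = \sigma^2 (u + i v) with u, v real, and exponentiate:
1 - \sigma^2 u - i\sigma^2 v = e^{1/Q - \lambda u}\left[\cos(\lambda v) - i\sin(\lambda v)\right].
% Imaginary and real parts:
\sigma^2 v = e^{1/Q-\lambda u}\sin(\lambda v), \qquad
1 - \sigma^2 u = \frac{\sigma^2 v}{\tan(\lambda v)}.
% Take the log of the first relation and substitute u from the second:
\frac{\lambda}{\sigma^2} - \frac{v\lambda}{\tan(v\lambda)} + \log(v\sigma^2)
  - \log\sin(v\lambda) - \frac{1}{Q} = 0.
```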

The solution $\rho(\lambda)$ for a given Q looks fairly similar to $\rho(\lambda)$ of the standard (Wishart)
case (see footnote 4). This is illustrated in Fig. 1, where we plot the spectrum of the exponentially
weighted random matrix with $Q = 1/(N(1-\alpha)) = 2$ and the spectrum of the standard
random matrix with Q = T/N = 3.45 (for which the upper edges of the two spectra
coincide). It can be clearly seen from the figure that the two curves run quite close to
each other.
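A minimal numerical sketch of Eq. (18) for $\sigma = 1$ follows. Two points are our own working rather than statements from the text: the band edges are obtained from the $v \to 0$ limit of Eq. (18), which reduces to $x - 1 - \log x = 1/Q$, and the root is searched in the variable $\theta = v\lambda$ on $(0, \pi)$, where the left-hand side is increasing. Plain bisection is used; the function names are hypothetical.

```python
import math

def band_edges(big_q: float) -> tuple:
    """Lower and upper edge for sigma = 1: the two roots of x - 1 - log(x) = 1/Q."""
    def h(x: float) -> float:
        return x - 1.0 - math.log(x) - 1.0 / big_q

    def bisect(lo: float, hi: float) -> float:
        sign_lo = h(lo) > 0
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if (h(mid) > 0) == sign_lo:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    hi = 2.0
    while h(hi) < 0:                 # expand until the upper root is bracketed
        hi *= 2.0
    return bisect(1e-12, 1.0), bisect(1.0, hi)

def density(lam: float, big_q: float, edges: tuple) -> float:
    """rho(lam) = Q v / pi with v solving Eq. (18) for sigma = 1; zero outside the band."""
    lower, upper = edges
    if not lower < lam < upper:
        return 0.0

    def f(theta: float) -> float:    # Eq. (18) in terms of theta = v * lam
        return (lam - theta / math.tan(theta) + math.log(theta / math.sin(theta))
                - math.log(lam) - 1.0 / big_q)

    lo, hi = 1e-9, math.pi - 1e-9
    for _ in range(100):             # f increases with theta, so bisection converges
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    v = 0.5 * (lo + hi) / lam
    return big_q * v / math.pi

Q = 2.0
edges = band_edges(Q)
n = 1000
step = (edges[1] - edges[0]) / n
mass = step * sum(density(edges[0] + i * step, Q, edges) for i in range(n + 1))
```

Summing the density over the band gives a total mass close to one, which is a quick consistency check on the root-finding.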

The spectrum obtained in the limit $N \to \infty$, $\alpha \to 1$ with $Q = 1/(N(1-\alpha))$ fixed
can be compared with the distribution of eigenvalues for finite N. Fig. 2 shows the
spectrum of the exponentially weighted random matrix with $Q = 1/(N(1-\alpha)) = 2$ in
the limit $N \to \infty$ and the histogram of eigenvalues for one realization of the matrix
with the same value of Q for finite N = 400. It can be seen that the fit is quite good
already for a single realization of the matrix.

4 Portfolio Optimization Results and Discussion 

In order to test the performance of the covariance matrix estimator in the context of
portfolio optimization, we consider the simplest version of the classical (mean-variance)
portfolio optimization problem: the portfolio variance $\sum_{i,j=1}^{N} w_i C_{ij} w_j$ is minimized
under the budget constraint $\sum_{i=1}^{N} w_i = 1$, where $w_i$ denotes the weight of asset i in the
portfolio and $C_{ij}$ the covariance matrix of asset returns. In this case, the weights of
the "optimal" (minimum variance) portfolio are simply

$$w_i^{*} = \frac{\sum_{j=1}^{N} C^{-1}_{ij}}{\sum_{j,k=1}^{N} C^{-1}_{jk}}. \tag{19}$$



Footnote 4: For simplicity, in the following analysis we consider $\sigma = 1$.
Fig. 1: Spectrum of the exponentially weighted random matrix with $Q = 1/(N(1-\alpha)) = 2$
and the spectrum of the standard random matrix with Q = T/N = 3.45.

Fig. 2: Spectrum of the exponentially weighted random matrix with $Q = 1/(N(1-\alpha)) = 2$
in the limit $N \to \infty$ and the histogram of eigenvalues for one realization of the matrix for
finite N = 400.
By eliminating all additional sources of uncertainty stemming from the determination of
other parameters appearing in more complex formulations (such as, for example, expected
returns, which are notoriously hard to estimate), this form provides
an extremely convenient framework in which to test the efficiency of different covariance
matrix estimators as inputs for portfolio optimization (see Ref. [10]).
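The minimum variance weights of Eq. (19) can be computed as in the following sketch. The 3-asset covariance matrix is made up for the example, and solving a linear system instead of inverting the matrix explicitly is an implementation choice of this illustration.

```python
import numpy as np

def min_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Eq. (19): w = C^{-1} 1 / (1^T C^{-1} 1), computed without forming C^{-1}."""
    ones = np.ones(cov.shape[0])
    inv_times_ones = np.linalg.solve(cov, ones)   # row sums of the inverse matrix
    return inv_times_ones / inv_times_ones.sum()

cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])               # made-up 3-asset covariance matrix
w = min_variance_weights(cov)
variance = float(w @ cov @ w)
equal = np.full(3, 1.0 / 3.0)
variance_equal = float(equal @ cov @ equal)        # benchmark: equally weighted portfolio
```

By construction the weights sum to one, and the resulting variance cannot exceed that of any other fully invested portfolio evaluated with the same matrix, such as the equally weighted one.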

We assess the performance of several covariance matrix estimators based on the
out-of-sample performance of the portfolios constructed using the covariance matrices
provided by the estimators. For this purpose, we take a sample of financial returns (e.g.
daily returns on stocks) and divide it into an estimation ("past") period and an
evaluation ("future") period. We calculate different correlation matrix estimates based
on returns only from the first period and use them to construct "optimal" portfolios
(as given by Eq. (19)). Finally, we evaluate the performance of the estimators based on
the standard deviation of the corresponding portfolio returns in the second period. In
order to reduce the error that might arise from the use of a single sample, we perform
our experiments on a large number of samples bootstrapped from a larger dataset of
daily stock returns. More precisely, starting from the same dataset of 1306 daily returns
on 406 large-cap stocks traded on the New York Stock Exchange as used in Refs. [SUB],
for several values of N (number of assets) and $T_2$ (the length of the evaluation period),
in each iteration we select at random N assets and a period of time starting from the
beginning of the dataset and ending at a random point in time (in the last third of the
sample, in order to have an estimation period of sufficient length). The last $T_2$ data
points of this bootstrapped sample are used for evaluation, and the rest of the sample
for estimation.
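The sampling procedure described above can be sketched as follows. The data layout (stocks in rows, days in columns, oldest day first), the function name and the synthetic stand-in for the dataset are assumptions of this example; only the dimensions 406 and 1306 are taken from the text.

```python
import numpy as np

def bootstrap_split(returns: np.ndarray, n_assets: int, t2: int, rng):
    """returns has shape (n_stocks, n_days), oldest day first.
    Picks n_assets random stocks and an end date in the last third of the data,
    then splits the window [0, end) into estimation and evaluation parts."""
    n_stocks, n_days = returns.shape
    assets = rng.choice(n_stocks, size=n_assets, replace=False)
    end = rng.integers(2 * n_days // 3, n_days + 1)      # exclusive end index
    window = returns[assets, :end]
    return window[:, :-t2], window[:, -t2:]

rng = np.random.default_rng(42)
fake_returns = 0.01 * rng.standard_normal((406, 1306))   # synthetic stand-in for the dataset
past, future = bootstrap_split(fake_returns, n_assets=100, t2=20, rng=rng)
```

Repeating the split with fresh random draws and averaging the resulting out-of-sample statistics gives the bootstrapped estimates.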

We consider several methods for estimating covariance matrices. We calculate "historical"
estimates based on uniformly weighting the returns within a time window of
length $T_1$ (different estimates for different values of $T_1$):

$$C_{ij}^{\mathrm{h,eq},T_1} = \frac{1}{T_1}\sum_{k=0}^{T_1-1} x_{ik}\,x_{jk}, \tag{20}$$

where $x_{ik}$ denotes the return on asset i at time $t - k$, with t being the last point of
the estimation period. We also calculate "historical" estimates based on exponential
weighting of returns (different estimates for different values of the decay factor $\alpha$):

$$C_{ij}^{\mathrm{h,exp},\alpha} = (1-\alpha)\sum_{k=0}^{T_s-1}\alpha^{k}\,x_{ik}\,x_{jk}, \tag{21}$$

where $T_s$ denotes the length of the estimation period (which is chosen so that $\alpha^{T_s} \ll 1$).
In addition, we consider covariance estimators based on "filtering" the historical
(uniformly and exponentially weighted) matrices. For each historical matrix we consider
two versions of filtering: one based on the largest eigenvalue (see footnote 5), and the other based
on the eigenvalues above the noise band of the corresponding random matrix. Let us
denote these by $C_{ij}^{\mathrm{m,eq},T_1}$, $C_{ij}^{\mathrm{m,exp},\alpha}$ and by $C_{ij}^{\mathrm{r,eq},T_1}$, $C_{ij}^{\mathrm{r,exp},\alpha}$, respectively. To summarize,
for each value of $T_1$ we have three estimators based on uniformly weighting the returns,
and for each value of $\alpha$ we have three estimators based on exponential weighting.

Footnote 5: Such a procedure is consistent with the "single index" or "market" model widely used by
practitioners.
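The eigenvalue filtering can be sketched as below. The text does not spell out what happens to the eliminated eigenvalues, so this example assumes one common variant: eigenvalues at or below the noise edge are replaced by their average, which preserves the trace, while the eigenvectors are kept. The synthetic one-factor data, the function name and the use of the uniformly weighted (Wishart) edge $(1+\sqrt{N/T})^2$ are assumptions; for exponentially weighted matrices the edge would instead come from the spectrum of Section 3.

```python
import numpy as np

def rmt_filter(cov: np.ndarray, upper_edge: float) -> np.ndarray:
    """Keep eigenvalues above upper_edge; replace the rest by their mean (assumed rule)."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    noise = eigvals <= upper_edge
    cleaned = eigvals.copy()
    if noise.any():
        cleaned[noise] = eigvals[noise].mean()     # preserves the trace
    return (eigvecs * cleaned) @ eigvecs.T

rng = np.random.default_rng(3)
N, T = 100, 250
market = rng.standard_normal(T)                     # hypothetical common factor
x = 0.5 * market + rng.standard_normal((N, T))      # synthetic returns, shape (N, T)
x /= x.std(axis=1, keepdims=True)                   # unit variance per asset
C = x @ x.T / T
edge = (1.0 + np.sqrt(N / T)) ** 2                  # Wishart upper edge for sigma = 1
C_filtered = rmt_filter(C, edge)
```

The single-index variant corresponds to treating every eigenvalue except the largest as noise, whatever its size.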

In what follows, we compare the performance of these estimators for different values
of N (number of stocks) and $T_2$ (length of the "investment period"). The criterion used for
comparison is the ex-post volatility (i.e. the volatility in the "investment period") of
the minimum variance portfolios constructed by using the estimators based on ex-ante
return data (i.e. data from before the investment period). The volatility measures are obtained
by averaging over a large number of bootstrapped samples obtained from the dataset
of daily stock returns. The results for N = 100 and $T_2$ = 20 (investment period of one
month) are presented in Fig. 3.
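The comparison criterion can be sketched as follows. The annualization with 252 trading days per year and the use of the sample standard deviation are assumptions of this example (the text only states that volatilities are quoted in annual %); the equal weights and synthetic returns are placeholders.

```python
import numpy as np

def ex_post_volatility(weights: np.ndarray, future: np.ndarray,
                       periods_per_year: int = 252) -> float:
    """Annualized volatility (in %) of the portfolio over the evaluation period.
    future has shape (N, T2); 252 trading days per year is an assumed convention."""
    portfolio_returns = weights @ future
    return float(100.0 * portfolio_returns.std(ddof=1) * np.sqrt(periods_per_year))

rng = np.random.default_rng(7)
N, T2 = 100, 20
future = 0.01 * rng.standard_normal((N, T2))   # synthetic daily returns, 1% daily volatility
w = np.full(N, 1.0 / N)                        # equal weights, only for illustration
vol = ex_post_volatility(w, future)
```

In the experiment, the weights would come from Eq. (19) applied to each estimator, and the volatility would be averaged over the bootstrapped samples.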

It can be seen from the figure that the random-matrix-theory-based filtering performs
best for both uniform and exponential weighting. It is interesting to note
that in the case of uniform weighting the best choice for the length of the time window
$T_1$ is around 250, i.e. one year of (daily) data. In the case of exponential weighting
the best choice of the parameter $\alpha$ is around 0.996, which corresponds to an effective
time length $-1/\log\alpha$ of around 250 again. By comparing the two minima, one can
see that the estimator based on exponential weighting performs (slightly) better. This
shows that combining techniques that take into account the volatility and correlation
dynamics of time series (e.g. exponential weighting) with techniques that reduce the
effective dimensionality of the correlation matrix (e.g. random-matrix-theory-based
filtering) can provide covariance matrix estimates that lead to optimized portfolios with
better risk characteristics.

The historical estimators can perform quite well if enough data points are taken
into account (i.e. $T_1$ or $\alpha$ is large enough). For uniform weighting it seems that 2
years of daily data can be enough (for N = 100 assets!). One important point to note,
however, is that covariance matrices obtained using the RiskMetrics [12] method, i.e.
exponentially weighted historical estimates with $\alpha = 0.94$, are completely inappropriate
for portfolio optimization with a larger number of assets. For example, for N = 100
the volatility of the optimal portfolio obtained by using this matrix is around 16
(annual %), well above the values presented in Fig. 3. Even for $\alpha$ = 0.96 to 0.97, as used by
many practitioners, the volatility is above 12 (annual %). Therefore, although
RiskMetrics has been found very useful in estimating the market risk of portfolios,
the results above suggest that its direct use for portfolio optimization with a larger
number of assets may be completely misleading. As a matter of fact, this seems to
have been realized by practitioners, who advocate e.g. the use of larger $\alpha$ [15] or of
principal component analysis [16] (which, in view of our results, can be interpreted as
increasing the effective time length or decreasing the dimensionality of the problem,
respectively).

It is interesting to note that single-index estimators perform better when a smaller
number of data points is used for the estimation. The reason for this could be that the
fewer data points are used, the more correlation dynamics can be taken into account,
while the loss of estimation precision for the largest eigenspace is probably smaller.

Fig. 3: Ex post volatility (annual %) of optimal portfolios constructed using correlation
matrix estimators based on (top) uniform weighting, (bottom) exponential weighting of the
return data, as a function of (top) the length $T_1$ of the time window, (bottom) the decay
factor $\alpha$ used in the weighting. (h), (m), (r) denote the results obtained in the case of (h) the
historical/sample estimate, (m) the single-index/market model estimate and (r) the estimate using
random-matrix-theory-based filtering, respectively.

Simulations for longer "investment periods" (larger $T_2$) showed very similar results.
For more assets (larger N) the results are similar, although the effectiveness of historical
estimates decreases further. However, for fewer assets (e.g. N = 50) historical estimates
perform better and can compete with estimates based on random-matrix-theory-based
filtering.

5 Conclusion 

We have introduced a covariance matrix estimator that takes into account the heteroskedastic
nature of return series and reduces the effective dimension of portfolios
(and hence measurement noise) via techniques borrowed from random matrix theory. We
have demonstrated its superiority to both the method of exponentially weighted moving
averages and the uniformly-weighted random-matrix-theory-based filtering. We
have found that too strong an exponential cutoff wastes too much data, while too
weak a cutoff washes away the non-stationary nature of the time series. The optimal
attenuation factor, corresponding to the best balance between these two extremes, was
found to be higher than the value suggested by RiskMetrics.